Data Set Generator: Part 1

This post deals with creating the initial mass fractions that serve as initial conditions for the source-term ODEs.

Marlon Cabrera-Ormaza https://example.com/norajones
06-16-2020

Random Initial Mass Fraction Generator

To train, test and validate our ANN we need a data set from which the network can learn in a supervised way while staying grounded in physical reality.

In lay terms, it would not make sense to train a network to predict values that are not possible in real life. The intended way to generate the data samples is to solve the system of stiff ordinary differential equations and use the evolution steps of the numerical solver as input and label sets. For this we first need to provide a set of random initial conditions (in this case initial mass fractions and temperature) to pass to an ODE solver.

It is important that these randomly generated initial conditions agree with what is physically possible. Mathematically speaking, one can generate an infinite number of initial mass fractions containing all the species involved in a given reaction mechanism. However, some of them may not be plausible, because they can violate one or more conservation laws. In particular, we would like our samples to comply with mass conservation.

One way to achieve this is to use Bilger’s definition for a mixture.

Setting the work environment

The libraries that will be used for this are:

Analysis of the combustion gas

Before generating the set of random initial conditions we need to be able to obtain the chemical properties of our gas mixture. The library Cantera is useful here, since it reads the information from a reaction mechanism and provides functionality for interacting with it in Python.

After assigning the properties of a reaction mechanism to an object, Cantera provides functions to use them.


 Elements in the mechanism: ['O', 'H', 'C', 'N', 'Ar'] 
 Species in the mechanism: ['CH4', 'O2', 'H2O', 'N2', 'CO', 'CO2', 'H2'] 
 Number of atoms of Oxygen in Methane: 0.0 
 Number of atoms of Oxygen in Water: 1.0 
 Mass of Hydrogen:  1.00794 
 Mass of Methane:  16.04276

Since noble gases do not take part in any reaction, the following function is designed to remove them:


Elements without noble gases:  ['O', 'H', 'C', 'N']
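The filtering step can be sketched in plain Python. This is a minimal illustration, not the post's original code: the name `remove_noble_gases` and the `NOBLE_GASES` set are hypothetical, and the element list is copied from the output above.

```python
# Symbols of the noble gases (hypothetical helper constant for this sketch).
NOBLE_GASES = {"He", "Ne", "Ar", "Kr", "Xe", "Rn"}


def remove_noble_gases(elements):
    """Return the element symbols that are not noble gases, keeping their order."""
    return [el for el in elements if el not in NOBLE_GASES]


# Element list as printed for the mechanism above.
elements = ["O", "H", "C", "N", "Ar"]
print("Elements without noble gases: ", remove_noble_gases(elements))
```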

Bilger’s Definition

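Using the functions described above, it is possible to build one that generates the coefficient matrix of the linear system that must be solved for the initial conditions to agree with Bilger’s definition. The first row of the matrix enforces that the species mass fractions sum to one; each remaining row holds the mass fraction of one element (O, H, C, N) in every species.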


[[1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000]
 [0.0000 1.0000 0.8881 0.0000 0.5712 0.7271 0.0000]
 [0.2513 0.0000 0.1119 0.0000 0.0000 0.0000 1.0000]
 [0.7487 0.0000 0.0000 0.0000 0.4288 0.2729 0.0000]
 [0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000]]
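The structure of this matrix can be reproduced without Cantera. The sketch below is an illustration under stated assumptions: the atomic weights are approximate standard values (only the hydrogen value appears in the output above), the species compositions are the usual molecular formulas, and all function names are hypothetical.

```python
# Approximate standard atomic weights in g/mol (assumed values for this sketch).
ATOMIC_WEIGHT = {"O": 15.999, "H": 1.00794, "C": 12.011, "N": 14.007}

# Atoms of each element per molecule, in the species order of the mechanism.
COMPOSITION = {
    "CH4": {"C": 1, "H": 4},
    "O2": {"O": 2},
    "H2O": {"H": 2, "O": 1},
    "N2": {"N": 2},
    "CO": {"C": 1, "O": 1},
    "CO2": {"C": 1, "O": 2},
    "H2": {"H": 2},
}


def molecular_weight(species):
    """Sum of the atomic weights of all atoms in one molecule."""
    return sum(n * ATOMIC_WEIGHT[el] for el, n in COMPOSITION[species].items())


def element_mass_fraction(species, element):
    """Mass of `element` contained in one unit mass of `species`."""
    n_atoms = COMPOSITION[species].get(element, 0)
    return n_atoms * ATOMIC_WEIGHT[element] / molecular_weight(species)


def build_matrix(species, elements):
    """First row: all ones (mass fractions sum to one). Then one row per element."""
    rows = [[1.0] * len(species)]
    for el in elements:
        rows.append([element_mass_fraction(sp, el) for sp in species])
    return rows


species = list(COMPOSITION)
elements = ["O", "H", "C", "N"]
A = build_matrix(species, elements)
for row in A:
    print(" ".join(f"{value:.4f}" for value in row))
```

Because every species consists only of O, H, C and N, the four element rows add up to the first row, so the matrix has rank four rather than five.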

The right-hand-side vector is created from random mixture fractions of a fuel and an oxidizer. Python dictionaries offer a simple way to store the composition of each stream. For the results to be reproducible, a seeded random number generator is necessary. Nonetheless, its implementation will not come until later.


[1.0000 0.1689 0.0692 0.2061 0.5558]
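How such a vector can be assembled is sketched below. Everything specific here is an assumption for illustration: the fuel is taken as pure methane, the oxidizer as air with mass fractions 0.233 O2 and 0.767 N2, and the mixture fraction is fixed at 0.2753 instead of being drawn at random. The element mass fractions are copied (rounded) from the coefficient matrix printed above, and the function names are hypothetical.

```python
# Element mass fractions per species, copied from the columns of the matrix above.
ELEMENT_FRACTIONS = {
    "CH4": {"H": 0.2513, "C": 0.7487},
    "O2": {"O": 1.0},
    "N2": {"N": 1.0},
}

# Stream compositions as species mass fractions (assumed, not from the post).
fuel = {"CH4": 1.0}
oxidizer = {"O2": 0.233, "N2": 0.767}


def element_fractions(stream, elements):
    """Element mass fractions of a stream given its species mass fractions."""
    return [
        sum(y * ELEMENT_FRACTIONS[sp].get(el, 0.0) for sp, y in stream.items())
        for el in elements
    ]


def rhs_vector(z, fuel, oxidizer, elements):
    """Leading 1 (sum constraint), then element fractions of the blended mixture."""
    z_fuel = element_fractions(fuel, elements)
    z_oxid = element_fractions(oxidizer, elements)
    return [1.0] + [z * f + (1.0 - z) * o for f, o in zip(z_fuel, z_oxid)]


elements = ["O", "H", "C", "N"]
b = rhs_vector(0.2753, fuel, oxidizer, elements)  # 0.2753 is an illustrative value
print(" ".join(f"{value:.4f}" for value in b))
```

The element entries always sum to one, which keeps the vector consistent with the first row of the coefficient matrix.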

It is now possible to see both the coefficient matrix and the right-hand-side vector needed to solve the linear system Ax=b.


A =

 [[1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000]
 [0.0000 1.0000 0.8881 0.0000 0.5712 0.7271 0.0000]
 [0.2513 0.0000 0.1119 0.0000 0.0000 0.0000 1.0000]
 [0.7487 0.0000 0.0000 0.0000 0.4288 0.2729 0.0000]
 [0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000]]

b.transpose =

 [1.0000 0.1689 0.0692 0.2061 0.5558]

Solving the linear system

Once the system of equations is set up, it is necessary to solve it for the species mass fractions. Unfortunately, as can be seen, this is an underdetermined system (i.e. it has more unknowns than equations), so whenever it can be solved the solution will not be unique. There are two ways to deal with this issue: a least-squares solver or a non-negative least-squares solver. Both provide an approximation of the solution by minimizing the squared Euclidean norm:

$$\min_{x} \lVert Ax - b \rVert_2^2$$

The non-negative variant solves the same problem subject to the constraint $x \ge 0$.

However, the two solvers differ in certain aspects, so it is necessary to analyze the distribution of the samples generated with each of them in order to select an appropriate method.
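The two solvers can be tried on the system printed above. This is a minimal sketch that assumes `numpy.linalg.lstsq` and `scipy.optimize.nnls` as the two routines; the post does not show which calls were actually used. The matrix and vector are copied from the rounded output.

```python
import numpy as np
from scipy.optimize import nnls

# Coefficient matrix and right-hand side, copied from the printed output.
A = np.array([
    [1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
    [0.0000, 1.0000, 0.8881, 0.0000, 0.5712, 0.7271, 0.0000],
    [0.2513, 0.0000, 0.1119, 0.0000, 0.0000, 0.0000, 1.0000],
    [0.7487, 0.0000, 0.0000, 0.0000, 0.4288, 0.2729, 0.0000],
    [0.0000, 0.0000, 0.0000, 1.0000, 0.0000, 0.0000, 0.0000],
])
b = np.array([1.0000, 0.1689, 0.0692, 0.2061, 0.5558])

# Ordinary least squares: for an underdetermined system this returns the
# minimum-norm solution, whose entries are free to be negative.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=1e-10)

# Non-negative least squares: same objective, restricted to x >= 0.
x_nn, residual = nnls(A, b)

species = ["CH4", "O2", "H2O", "N2", "CO", "CO2", "H2"]
for name, ls_value, nn_value in zip(species, x_ls, x_nn):
    print(f"{name:>4}  lstsq={ls_value: .6f}  nnls={nn_value: .6f}")
```

Both results satisfy the linear system up to numerical tolerance; they differ in how the mass is spread over the species.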


        CH4        O2       H2O  ...       CO2        H2   Mixture
0  0.029103  0.074193  0.066647  ...  0.063934  0.006760  0.548814
1  0.049822  0.067301  0.060748  ...  0.066294  0.008738  0.715189
2  0.035821  0.071958  0.064734  ...  0.064699  0.007402  0.602763
3  0.028614  0.074355  0.066786  ...  0.063878  0.006714  0.544883
4  0.013517  0.079377  0.071085  ...  0.062158  0.005273  0.423655

[5 rows x 8 columns]

        CH4        O2  H2O        N2   CO  CO2        H2   Mixture
0  0.085670  0.213038  0.0  0.701289  0.0  0.0  0.000002  0.548814
1  0.111628  0.206991  0.0  0.681377  0.0  0.0  0.000003  0.715189
2  0.094087  0.211078  0.0  0.694832  0.0  0.0  0.000002  0.602763
3  0.085057  0.213181  0.0  0.701759  0.0  0.0  0.000002  0.544883
4  0.066142  0.217588  0.0  0.716268  0.0  0.0  0.000002  0.423655
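The values in the `Mixture` column of both tables match the first draws of NumPy's legacy global generator seeded with 0. The sketch below relies on that assumption; the post does not state which seed or generator was used.

```python
import numpy as np

# Seed the legacy global generator so that every run produces the same draws.
np.random.seed(0)

# Five uniform random numbers in [0, 1), one per sample.
mixture = np.random.rand(5)
print(np.round(mixture, 6))
```

Re-seeding with the same value before each run reproduces the same samples, which is what makes the generated data set repeatable.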

Distribution of the data samples

[Figures: scatter plots of the generated mass fractions, one per species: CH4, O2, H2O, N2, CO, CO2, H2.]

Final Solution of the linear system:

The least-squares method generates many negative mass fractions, while the non-negative least-squares method produces many zeros. An obvious idea is to combine both methods so that they complement each other. The scatter plots of the individual species show that, in most cases, whenever the least-squares solver produces a negative value, the non-negative one returns zero. Combining them in this way removes the negative mass fractions while avoiding that certain species receive only zero fractions (which is the main problem when using nnls).

Using this approach, it is now possible to define a function that uses both solvers.
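One possible reading of this combination is sketched below: keep the least-squares value wherever it is non-negative and fall back to the non-negative solver's value otherwise. The function name `combined_solve` is hypothetical, the choice of `numpy.linalg.lstsq` and `scipy.optimize.nnls` is an assumption, and the matrix and vector are copied from the rounded output shown earlier.

```python
import numpy as np
from scipy.optimize import nnls


def combined_solve(A, b):
    """Merge both solutions: lstsq entries where >= 0, nnls entries elsewhere."""
    x_ls, *_ = np.linalg.lstsq(A, b, rcond=1e-10)
    x_nn, _ = nnls(A, b)
    return np.where(x_ls < 0.0, x_nn, x_ls)


# Coefficient matrix and right-hand side, copied from the printed output.
A = np.array([
    [1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
    [0.0000, 1.0000, 0.8881, 0.0000, 0.5712, 0.7271, 0.0000],
    [0.2513, 0.0000, 0.1119, 0.0000, 0.0000, 0.0000, 1.0000],
    [0.7487, 0.0000, 0.0000, 0.0000, 0.4288, 0.2729, 0.0000],
    [0.0000, 0.0000, 0.0000, 1.0000, 0.0000, 0.0000, 0.0000],
])
b = np.array([1.0000, 0.1689, 0.0692, 0.2061, 0.5558])

x = combined_solve(A, b)
print(np.round(x, 6))
```

A vector merged entry by entry is not guaranteed to satisfy the linear system exactly, so checking its sum afterwards is a reasonable precaution.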


        CH4        O2       H2O        N2        CO       CO2        H2
0  0.029103  0.074193  0.066647  0.701289  0.058074  0.063934  0.006760
1  0.049822  0.067301  0.060748  0.681377  0.065719  0.066294  0.008738
2  0.035821  0.071958  0.064734  0.694832  0.060553  0.064699  0.007402
3  0.028614  0.074355  0.066786  0.701759  0.057893  0.063878  0.006714
4  0.013517  0.079377  0.071085  0.716268  0.052322  0.062158  0.005273

Distribution of the Mass Fractions

To finish, it is important to check once again the distribution of the mass fractions for each individual species.

[Figures: distribution of the final mass fractions, one plot per species: CH4, O2, H2O, N2, CO, CO2, H2.]